Multiset-Valued Linear Index Grammars:
Imposing Dominance Constraints on Derivations 
Abstract

This paper defines multiset-valued linear index gram- 
mar and unordered vector grammar with dominance 
links. The former models certain uses of multiset- 
valued feature structures in unification-based for- 
malisms, while the latter is motivated by word order 
variation and by "quasi-trees", a generalization of trees.
The two formalisms are weakly equivalent, and an im- 
portant subset is at most context-sensitive and polyno- 
mially parsable. 

Early attempts to use context-free grammars (CFGs) as
a mathematical model for natural language syntax have 
largely been abandoned; it has been shown that (un- 
der standard assumptions concerning the recursive na- 
ture of clausal embedding) the cross-serial dependencies 
found in Swiss German cannot be generated by a CFG 
(Shieber, 1985). Several mathematical models have 
been proposed which extend the formal power of CFGs, 
while still maintaining the formal properties that make 
CFGs attractive formalisms for formal and computational
linguists, in particular, polynomial parsability
and restricted weak generative capacity. These mathematical
models include tree adjoining grammar (TAG)
(Joshi et al., 1975; Joshi, 1985), head grammar (Pollard,
1984), combinatory categorial grammar (CCG) (Steed- 
man, 1985), and linear index grammar (LIG) (Gaz- 
dar, 1988). These formalisms have been shown to be 
weakly equivalent to each other (Vijay-Shanker et al., 
1987; Vijay-Shanker and Weir, 1994); we will refer to 
them as "LIG-equivalent formalisms". LIG is a variant
of index grammar (IG) (Aho, 1968). Like CFG, IG
is a context-free string rewriting system, except that 
the nonterminal symbols in a CFG are augmented with 
stacks of index symbols. The rewrite rules push or pop 
indices from the index stack. In an IG, the index stack 
is copied to all nonterminal symbols on the right-hand 
side of a rule. In a LIG, the stack is copied to exactly 
one right-hand side nonterminal. 1 
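
The difference between the two ways of passing the index stack can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names `rewrite_ig` and `rewrite_lig` and the `heir` parameter are hypothetical, terminals are left out, and only pushing (not popping) is shown.

```python
def rewrite_ig(stack, rhs, push=()):
    """IG: every right-hand-side nonterminal receives a copy of the stack."""
    new_stack = tuple(stack) + tuple(push)
    return [(b, new_stack) for b in rhs]


def rewrite_lig(stack, rhs, heir, push=()):
    """LIG: only the nonterminal at position `heir` inherits the stack;
    all other right-hand-side nonterminals start with an empty stack."""
    new_stack = tuple(stack) + tuple(push)
    return [(b, new_stack if k == heir else ()) for k, b in enumerate(rhs)]


# The stack (f, g) is copied to both B and C in the IG case,
# but handed to B alone in the LIG case.
print(rewrite_ig(("f", "g"), ["B", "C"]))
print(rewrite_lig(("f", "g"), ["B", "C"], heir=0))
```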

While LIG-equivalent formalisms have been shown to
provide adequate formal power for a wide range of lin- 
guistic phenomena (including the aforementioned Swiss 
German construction), the need for other mathemati- 
cal formalisms has arisen in several unrelated areas. In 
this paper, we discuss three such cases. First, captur- 
ing several semantic and syntactic issues in unification- 
based formalisms leads to the use of multiset-valued 
feature structures. Second, word order facts from lan- 
guages such as German, Russian, or Turkish cannot be 
derived by LIG-equivalent formalisms. Third, a gener- 
alization of trees to "quasi-trees" (Vijay-Shanker, 1992) 
in the spirit of D-Theory (Marcus et al., 1983) leads 
to the definition of a new formal system. In this pa- 
per, we introduce two new equivalent mathematical for- 
malisms which provide adequate descriptions for these 
three phenomena. 

The paper is structured as follows. First, we present 
the three phenomena in more detail. We then introduce 
multiset-valued LIG and present some formal proper- 
ties. Thereafter, we introduce a second rewriting sys- 
tem and show that it is weakly equivalent to the LIG 
variant. We then briefly mention some related for- 
malisms. We conclude with a brief summary. 



Three Problems for LIG-Equivalent Formalisms

The three problems we present are of a rather differ- 
ent nature. The first arises from the way a linguis- 
tic problem is treated in a specific type of framework 
(unification-based formalisms). The second problem 
derives directly from linguistic data. The third prob- 
lem is a formalism which has been motivated on in- 
dependent, methodological grounds, but whose formal 
properties are unknown. 

Multiset-Valued Feature Structures 

HPSG (Pollard and Sag, 1987; Pollard and Sag, 1994)
uses typed feature structures as its formal basis, which
are Turing-equivalent. However, it is not necessarily
the case that the full power of the system is used in
the linguistic analyses that are expressed in it. HPSG
analyses include information about constituent structure
which can be represented as a context-free phrase-structure
tree. In addition, various mechanisms have
been proposed to handle certain linguistic phenomena
that relate two nodes within this tree. One of these
is a multiset-valued feature that is passed along the
phrase-structure tree from daughter node to mother
node. Multiset-valued features have been proposed for
the SLASH feature, which handles wh-dependencies (Pollard
and Sag, 1994, Chapter 4), and for certain semantic
purposes, including the representation of stored quantifiers
in a mechanism similar to Cooper-storage. Another
use may be the representation of anti-coreference
constraints arising from Principle C of Binding Theory
(be it that of Chomsky (1981) or of Pollard and Sag
(1992)).

1 Note that a LIG is not an IG that is linear (i.e., whose
productions have at most one nonterminal on the right-hand
side), but rather, it is a context-free grammar with linear
indices (i.e., the indices are never copied).

It is desirable to be able to assess the formal power 
of such a system, for both theoretical and practical 
reasons. Theoretically, it would be interesting if it 
turned out that the linguistic principles formulated in 
HPSG naturally lead to certain restricted uses of the 
unification-based formalism. Clearly this would repre- 
sent an important insight into the nature of grammat- 
ical competence. On the practical side, formal equiv- 
alences can guide the building of applications such as 
parsers for existing HPSG grammars. For example, it 
has been proposed that HPSG grammars can be "com- 
piled" into TAGs in order to obtain a computationally 
more tractable system (Kasper, 1992), thus sidestep- 
ping the issue of building parsers for HPSG directly. 
However, LIG-equivalent formalisms cannot serve as 
targets for compilations in cases in which HPSG uses 
multiset-valued feature structures.

Word Order Variation 

Becker et al. (1991) discuss scrambling, which is the 
permutation of verbal arguments in languages such as 
German, Korean, Japanese, Hindi, Russian, and Turk- 
ish. If there are embedded clauses, scrambling in many 
languages can affect arguments of more than one verb 
("long-distance" scrambling). 

(1) ... daß [den Kühlschrank]_i bisher noch
    ... that the refrigerator_ACC so far yet
    niemand [t_i zu reparieren] versprochen hat
    no-one_NOM to repair promised has

    ... that so far, no-one has promised to repair the refrigerator

Scrambling in German is "doubly unbounded" in the 
sense that there is no bound on the number of clause 
boundaries over which an element can scramble, and 
an element scrambled (long-distance or not) from one 
clause does not preclude the scrambling of an element 
from another clause: 



(2) ... daß [dem Kunden]_i [den Kühlschrank]_j
    ... that the client_DAT the refrigerator_ACC
    bisher noch niemand t_i [[t_j zu reparieren]
    so far yet no-one_NOM to repair
    zu versuchen] versprochen hat
    to try promised has

    ... that so far, no-one yet has promised the client to
    repair the refrigerator

Similar data has been observed in the literature for 
other languages, for example for Finnish by Karttunen 
(1989). Becker et al. (1991) argue that a simple TAG 
(and the other LIG-equivalent formalisms) cannot derive
the full range of scrambled sentences. Rambow and
Satta (1994) propose the use of unordered vector gram- 
mar (UVG) to model the data. In UVG (Cremers and 
Mayer, 1973), several context-free string rewriting rules 
are grouped into vectors, as for verspricht 'promises': 

(3) $\big( (S \to NP_{nom}\ VP),\ (VP \to NP_{dat}\ VP),\ (VP \to S_{inf}\ V),\ (V \to \textit{verspricht}) \big)$

During a derivation, rules from a vector can be ap- 
plied in any order, and rules from different vectors can 
be interleaved, but at the end, all rules from an instance 
of a vector must have been used in the derivation. By 
varying the order in which rules from different vectors 
are applied, we can derive different word orders. Ob- 
serve that the vector in (3) contains exactly one ter- 
minal symbol (the verb); grammars in which every el- 
ementary structure (vector in UVG, tree in TAG, rule 
in CFG) contains at least one terminal symbol we will 
call lexicalized. 

Languages generated by UVG are known to be 
context-sensitive and semilinear (Cremers and Mayer, 
1974) and polynomially parsable (Satta, 1993). How- 
ever, they are not adequate for modeling natural lan- 
guage syntax. In the following example, (4a) is out since 
there is no analysis in which the moved NP c-commands 
its governing verb, as is the case in (4b). 

(4) a. * ... daß niemand [dem Kunden] [t_i
         ... that no-one_NOM the client_DAT
         zu versuchen] [den Kühlschrank]_j versprochen
         to try the refrigerator_ACC promised
         hat [t_j zu reparieren]_i
         has to repair

    b. ? ... daß niemand [dem Kunden] [den
         Kühlschrank]_j [t_i zu versuchen] versprochen hat
         [t_j zu reparieren]_i

What is needed is an additional mechanism that en- 
forces a dominance relation between the sister node of 
an argument and its governing verb. 

Quasi-Trees

Vijay-Shanker (1992) introduces "quasi-trees" as a gen- 
eralization of trees. He starts from the observation 
that the traditional definition of tree adjoining grammar
(TAG) is incompatible with a unification-based approach
because the trees of a TAG start out as fully
specified objects, which are later modified; in particu- 
lar, immediate dominance relations in a tree need not 
hold after another tree is adjoined into it. In order to ar- 
rive at a definition that is compatible with a unification- 
based approach, he makes three minimal assumptions 
about the nature of the objects used for the representa- 
tion of natural language syntax. The first assumption 
(left implicit) is that these objects represent phrase- 
structure. The second assumption is that they "give 
a sufficiently enlarged domain of locality that allows 
localization of dependencies such as subcategorization, 
and filler-gap" (Vijay-Shanker, 1992, p.486). The third 
assumption is that dominance relations can be stated 
between different parts of the representation. These 
assumptions lead Vijay-Shanker to define quasi-trees, 
which are partial descriptions of trees in which "quasi- 
nodes" (partial descriptions of nodes) are related by 
dominance constraints. Each node in a traditional tree 
(as used in TAG) corresponds to two quasi-nodes, a top 
and a bottom version, such that the top dominates the 
bottom. 

There are two ways of interpreting quasi-trees: ei- 
ther quasi-trees can be seen as data structures in their 
own right; or quasi-trees can be seen as descriptions 
of trees whose denotations are sets of (regular) trees. 
If quasi-trees are defined as data structures, we can 
define operations such as adjunction and substitution 
and notions such as "derived structure". More precisely,
we define quasi-trees to be structures consisting
of pairs of nodes, called quasi-nodes, such that one is 
the "top" quasi-node and the other is the "bottom" 
quasi-node. The top and bottom quasi-node of a pair 
are linked by a dominance constraint. Bottom quasi- 
nodes immediately dominate top quasi-nodes of other 
quasi-node pairs, and each top quasi-node is immedi- 
ately dominated by exactly one bottom quasi-node. For 
simplicity, we will assume that there is only a bottom 
root quasi-node (i.e., no top root quasi-node), and that 
bottom frontier quasi-nodes are omitted (i.e., frontier 
nodes just consist of top quasi-nodes). Furthermore, 
we will assume that each quasi-node has a label, and 
is equipped with a finite feature structure. A sample 
quasi-tree is shown in Figure 1 (quasi-tree $\alpha_5$ of
Vijay-Shanker (1992, p.488)).

We follow Vijay-Shanker (1992, Section 2.5) in defin- 
ing substitution as the operation of forming a quasi-node 
pair from a frontier node of one tree (which becomes the 
top node) and the root node of another tree (which be- 
comes the bottom node). As always, a dominance link 
relates the two quasi-nodes of the newly formed pair. 
Adjunction is not defined separately: it suffices to say 
that a pair of quasi-nodes is "broken up", thus forming
two quasi-trees. We then perform two substitutions. 
Observe that nothing keeps us from breaking up more 
than one pair of quasi-nodes in either of two quasi-trees, 
and then performing more than two substitutions (as
long as dominance constraints are respected); there are
no operations in regular TAG that correspond to such
operations. We will say that a quasi-tree is derived if in
all quasi-node pairs, the two quasi-nodes are equated,
meaning that they have the same label and the two
feature structures are unified, and furthermore, if all
frontier quasi-nodes have terminal labels. The string
associated with this quasi-tree is defined in the usual
way.

Figure 1: Sample quasi-tree

We have now fully defined a formalism (if informally):
its data structures (quasi-trees), its combination operation
(substitution), and the notion of derived structure.
We will call this formalism Quasi-Tree Substitution
Grammar (QTSG). It can easily be seen that all
examples discussed by Vijay-Shanker (1992) are deriva- 
tions in QTSG. The question arises as to the formal and 
computational properties of QTSG. 

Multiset-Valued LIG 

In order to find a mathematical model for certain uses
of multiset-valued feature structures, discussed above,
we now introduce a multiset-valued variant of LIG. We
denote by $\mathcal{M}(A)$ the set of multisets over the elements
of A, and we use the standard set notation to refer to
the corresponding multiset operations.

Definition 1 A multiset-valued Linear Index
Grammar ({}-LIG) is a 5-tuple $(V_N, V_T, V_I, P, S)$,
where $V_N$, $V_T$, and $V_I$ are disjoint sets of nonterminals,
terminals, and indices, respectively; $S \in V_N$ is the
start symbol; and P is a set of productions of the
following form:

$p : A s \to v_0 B_1 s_1 v_1 \cdots v_{n-1} B_n s_n v_n$

for some $n \geq 0$, $A, B_1, \ldots, B_n \in V_N$, $s, s_1, \ldots, s_n$
multisets of members of $V_I$, and $v_0, \ldots, v_n \in V_T^*$.

The derivation relation $\Rightarrow$ for a {}-LIG is defined
as follows. Let $\beta, \gamma \in (V_N \mathcal{M}(V_I) \cup V_T)^*$, $t, t_1, \ldots, t_n$
multisets of members of $V_I$, and $p \in P$ of the form
given above. Then we have

$\beta A t \gamma \Rightarrow_p \beta v_0 B_1 t_1 v_1 \cdots v_{n-1} B_n t_n v_n \gamma$

such that $t = \bigcup_{i=1}^{n} (t_i \setminus s_i) \cup s$. If G is a {}-LIG,
$L(G) = \{w \mid S \Rightarrow_G^* w,\ w \in V_T^*\}$.



Suppose we want to apply rule p to an instance of
nonterminal A with an index multiset t in a sentential
form. First, we remove the indices in s from t, then we
rewrite the nonterminal, then we distribute the remaining
indices freely among the newly introduced nonterminals
$B_1, \ldots, B_n$, creating new multisets, and finally
we add $s_i$ to the new multiset for each $B_i$, creating the
new $t_i$.
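
The four steps just described can be sketched with Python's `collections.Counter` as the multiset type. This is a minimal sketch, not the paper's own formulation: the names `distributions` and `apply_rule` are hypothetical, terminal strings are omitted, and multiset union is read as multiset sum (which is what the step-by-step description suggests).

```python
from collections import Counter
from itertools import product


def distributions(pool, n):
    """Yield every way to split the multiset `pool` into n multisets."""
    items = sorted(pool.elements())
    seen = set()
    for assignment in product(range(n), repeat=len(items)):
        parts = [Counter() for _ in range(n)]
        for item, k in zip(items, assignment):
            parts[k][item] += 1
        key = tuple(tuple(sorted(p.items())) for p in parts)
        if key not in seen:          # skip duplicates caused by repeated indices
            seen.add(key)
            yield parts


def apply_rule(t, s, rhs):
    """Apply  A s -> B_1 s_1 ... B_n s_n  to A with index multiset t.

    `rhs` is a list of (B_i, s_i) pairs. Returns every possible list
    [(B_1, t_1), ..., (B_n, t_n)]; the list is empty if the rule does not apply.
    """
    t, s = Counter(t), Counter(s)
    if s - t:                        # s is not contained in t
        return []
    rest = t - s                     # step 1: remove the indices in s
    if not rhs:                      # n = 0: nothing may be left over
        return [[]] if not rest else []
    results = []
    for parts in distributions(rest, len(rhs)):        # step 3: distribute freely
        results.append([(b, part + Counter(si))        # step 4: add s_i
                        for (b, si), part in zip(rhs, parts)])
    return results


# A rule of the shape C{s_c} -> C c applied to C{s_c, s_c, s_c}
# leaves C with two copies of s_c.
print(apply_rule(["sc", "sc", "sc"], ["sc"], [("C", [])]))
```

The enumeration is exponential in the size of the multiset and is meant only to make the nondeterministic "distribute freely" step concrete.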

The reader will have noticed, and hopefully excused, 
the abuse of notation in this definition, which results 
from mixing set-notation and string-notation. We can 
also define {}-LIG as a pure string-rewriting system 
which does not require the definition of additional data 
structures (the multisets) for the notion of "derivation" 
(see (Rambow, 1994)). However, the definition provided
here (using an explicit representation of multisets)
has the advantage of corresponding more directly
to the intuition underlying {}-LIG and is much easier 
to understand and use in proofs. The issue is purely 
notational. 

We now introduce a restriction on derivations, which 
will be useful later. 

Definition 2 A linearly-restricted derivation in a
{}-LIG is a derivation $\varrho : S \Rightarrow^* w$ with $w \in V_T^*$ such
that:

1. The number of index symbols added (and hence removed)
during the derivation is linearly bounded by $|w|$.

2. The number of $\varepsilon$-productions used during the derivation
is linearly bounded by $|w|$.

We let $L_R(G) = \{w \mid \text{there is a derivation } \varrho : S \Rightarrow^* w \text{ such that } \varrho \text{ is linearly-restricted}\}$,
and we let $\mathcal{L}_R(\{\}\text{-LIG}) = \{L_R(G) \mid G \text{ a } \{\}\text{-LIG}\}$. If G is a {}-LIG
such that $L_R(G) = L(G)$, we say that G is linearly
restricted. Many of the results that we will show apply
only to linearly restricted {}-LIGs. However, as we
will see, all linguistic applications will make use of this
restricted version.

Example 1

The following grammar derives the language
COUNT-5, where COUNT-5 = $\{a^n b^n c^n d^n e^n \mid n \geq 0\}$.

Let $G_1 = (V_N, V_T, V_I, P, S)$ with:

$V_N = \{S, A, B, C, D, E\}$
$V_T = \{a, b, c, d, e\}$
[...]
$p_7$: [...] $\to Cc$, $p_8 : C$ [...]
$p_9 : D\{s_d\} \to Dd$, $p_{10} : D$ [...]
$p_{11} : E\{s_e\} \to Ee$, $p_{12} : E$ [...] $\to e$



A sample derivation is shown in Figure 2. 



This example shows that $\mathcal{L}(\{\}\text{-LIG})$ is not contained
in $\mathcal{L}(\text{LIG})$, since the latter cannot derive COUNT-5. We
now define two normal forms which will be used later.
We omit the proofs and refer to (Rambow, 1994) for
details.

Definition 3 A {}-LIG $G = (V_N, V_T, V_I, P, S)$ is in
restricted index normal form or RINF if all productions
in P are of one of the following forms (where
$A, B \in V_N$, $f \in V_I$ and $\alpha \in (V_T \cup V_N)^*$):

1. $A \to \alpha$

2. $A \to Bf$

3. $Af \to B$

Theorem 1 For any {}-LIG, there is an equivalent 
{}-LIG in RINF. 

Definition 4 A {}-LIG $G = (V_N, V_T, V_I, P, S)$ is in
Extended Two Form (ETF) if every production in
P has the form $As \to B_1 s_1 B_2 s_2$, $As \to Bs'$, or
$A \to a$, where $A, B, B_1, B_2 \in V_N$, $s, s_1, s_2, s' \in V_I^*$, and
$a \in V_T \cup \{\varepsilon\}$.

Theorem 2 For any {}-LIG, there is an equivalent 
{}-LIG in ETF. 

We now discuss some formal properties of {}-LIG. 
For reasons of space limitation, we only sketch the 
proofs; full versions can be found in (Rambow, 1994). 
We start with the weak generative power. We have already
seen that {}-LIG can generate languages not in
$\mathcal{L}(\text{LIG})$ (and hence not in $\mathcal{L}(\text{TAG})$). We will now show
that linearly restricted {}-LIGs are at most context-sensitive.

Theorem 3 $\mathcal{L}_R(\{\}\text{-LIG}) \subseteq \mathcal{L}(\text{CSG})$.

Outline of the proof. We simulate a derivation in a
linear bounded automaton. The space needed for this is
bounded linearly in the length of the input word, since
the number of symbols that are erased (the index
symbols and the nonterminals that rewrite to $\varepsilon$) is linearly
bounded. ■

What sort of languages could a {}-LIG possibly not 
generate? Consider the copy language $L = \{ww \mid w \in \{a, b\}^*\}$,
and let us suppose that it is generated by G, a
{}-LIG. This language cannot be generated by a CFG. 
We therefore know that for any integer M , there are in- 
finitely many strings in L whose derivation in G is such 
that at some point, an index multiset in the sentential 
form contains more than M index symbols (since any 
finite use of index symbols can be simulated by a pure 
CFG). It must be the case that this unbounded multiset 
is crucial in restricting the second half of the generated 
string in such a way that it copies the first half (again, 
since a pure CFG cannot derive such strings). However, 
it is impossible for a data structure like a (multi-)set 
(over a finite index alphabet) to record the required se- 
quential information. Therefore, the second half of the 
string cannot be adequately constrained, and G cannot 
exist. This argument motivates the following conjec- 
ture. 

Conjecture 4 $\{ww \mid w \in \{a, b\}^*\}$ is not in $\mathcal{L}(\{\}\text{-LIG})$.



$S \Rightarrow_{p_1} S\{s_a, s_b, s_c, s_d, s_e\}$
$\Rightarrow_{p_1 p_1} S\{s_a, s_b, s_c, s_d, s_e, s_a, s_b, s_c, s_d, s_e, s_a, s_b, s_c, s_d, s_e\}$
$\Rightarrow_{p_2} A\{s_a, s_a, s_a\} B\{s_b, s_b, s_b\} C\{s_c, s_c, s_c\} D\{s_d, s_d, s_d\} E\{s_e, s_e, s_e\}$
$\Rightarrow_{p_7} A\{s_a, s_a, s_a\} B\{s_b, s_b, s_b\} C\{s_c, s_c\} c D\{s_d, s_d, s_d\} E\{s_e, s_e, s_e\}$
$\Rightarrow_{p_3 p_3 p_4} aaa B\{s_b, s_b, s_b\} C\{s_c, s_c\} c D\{s_d, s_d, s_d\} E\{s_e, s_e, s_e\}$
$\Rightarrow^* aaabbbcccdddeee$

Figure 2: Sample derivation in {}-LIG $G_1$



We now turn to closure properties. 

Theorem 5 $\mathcal{L}(\{\}\text{-LIG})$ is a substitution-closed full
abstract family of languages (AFL).

Outline of the proof. Since $\mathcal{L}(\{\}\text{-LIG})$ contains all
context-free languages, it contains all regular languages,
and therefore it is sufficient to show that $\mathcal{L}(\{\}\text{-LIG})$ is
closed under intersection with regular languages and
substitution. These results are shown by adapting the
techniques used to show the corresponding results for
CFGs. ■

Finally, we turn to the recognition and parsing prob- 
lem. Again, we will restrict our attention to the linearly 
restricted version of {}-LIG. 

Theorem 6 Each language in $\mathcal{L}_R(\{\}\text{-LIG})$ can be
recognized in polynomial deterministic time.

Outline of the proof. We extend the CKY parser for
CFG. Let G be a {}-LIG in ETF. Since G may contain
$\varepsilon$-productions, the algorithm is adapted by letting the
indices of the matrix refer to positions between symbols
in the input string, not the symbols themselves.
In order to account for the index multiset, we let the
entries in the recognition matrix be pairs consisting of
a nonterminal symbol and a $|V_I|$-tuple of integers:

$(A, (n_1, \ldots, n_{|V_I|}))$

The $|V_I|$-tuple of integers represents a multiset, with
each integer designating the number of copies of a given
index symbol that the set contains. In an entry of
the matrix, each pair represents a partial derivation
of a substring of the input string. More precisely, if
the input word is $a_1 \cdots a_n$, and if $V_I = \{i_1, \ldots, i_{|V_I|}\}$,
then we have $(A, (n_1, \ldots, n_{|V_I|}))$ in entry $t_{i,j}$ of the
recognition matrix if and only if there is a derivation
$As \Rightarrow^* a_{i+1} \cdots a_j$, where multiset s contains $n_k$ copies
of index symbol $i_k$, $1 \leq k \leq |V_I|$. Clearly, there
is a derivation in the grammar if and only if entry
$t_{0,n}$ contains the pair $(S, (0, \ldots, 0))$. Now since the
grammar is linearly restricted, each $n_k$ is bounded by
n, and hence the number of different pairs is linearly
bounded by $|V_N|\, n^{|V_I|}$. Thus each entry in the matrix
can be computed in $O(n^{1+2|V_I|})$ steps, and since there
are $O(n^2)$ entries, we get an overall time complexity of
$O(n^{3+2|V_I|})$. ■
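
The representation of matrix entries used in this outline can be made concrete as follows. The sketch only covers the encoding of a multiset as a tuple of counts and the enumeration of candidate pairs; it is not a recognizer, and the names `encode` and `possible_entries` are hypothetical. One plausible reading of the per-entry cost is that a split point (at most n choices) is combined with a pair of count tuples, one from each sub-entry.

```python
from collections import Counter
from itertools import product


def encode(multiset, index_symbols):
    """Represent a multiset over V_I as a |V_I|-tuple of counts."""
    counts = Counter(multiset)
    return tuple(counts[i] for i in index_symbols)


def possible_entries(nonterminals, index_symbols, bound):
    """All pairs (A, (n_1, ..., n_|V_I|)) with every n_k between 0 and bound."""
    ranges = [range(bound + 1)] * len(index_symbols)
    return [(a, counts) for a in nonterminals for counts in product(*ranges)]


V_I = ["f", "g", "h"]
print(encode(["f", "g", "f"], V_I))
# With counts bounded by n, there are |V_N| * (n + 1) ** |V_I| candidate pairs.
print(len(possible_entries(["S", "A"], ["f", "g"], 3)))
```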



UVG with Dominance Links 

We now formally define UVG with dominance links 
(UVG-DL), which serves as a formal model for the sec- 
ond and third phenomena introduced above, word order 
variation and quasi-trees. The definition differs from 
that of UVG only in that vectors are equipped with 
dominance relations which impose an additional condi- 
tion on derivations. Note that the definition refers to 
the notion of derivation tree of a UVG, which is defined 
as for CFG. 

Definition 5 An Unordered Vector Grammar
with Dominance Links (UVG-DL) is a 4-tuple
$(V_N, V_T, V, S)$, where $V_N$ and $V_T$ are sets of nonterminals
and terminals, respectively, S is the start symbol,
and V is a set of vectors of context-free productions
equipped with dominance links. For a given vector
$v \in V$, the dominance links form a binary relation
$dom_v$ over the set of occurrences of nonterminals in
the productions of v such that if $dom_v(A, B)$, then A
(an instance of a symbol) occurs in the right-hand side
of some production in v, and B is the left-hand symbol
(instance) of some production in v.

If G is a UVG-DL, L(G) consists of all words $w \in V_T^*$
which have a derivation $\varrho$ of the form

$S \Rightarrow_{p_1} w_1 \Rightarrow_{p_2} w_2 \cdots w_{r-1} \Rightarrow_{p_r} w_r = w$,

such that $\varrho$ meets the following two conditions:

1. $p_1 p_2 \cdots p_r$ is a permutation of a member of $V^*$.

2. The dominance relations of V, when interpreted as
the standard dominance relation defined on trees,
hold in the derivation tree of $\varrho$.

The second condition can be formulated as follows:
if v in V contributes instances of productions $p_1$ and
$p_2$ (and perhaps others), and the kth daughter in the
right-hand side of $p_1$ dominates the left-hand nonterminal
of $p_2$, then in the context-free derivation tree associated
with $\varrho$ (the unique node associated with) the
kth daughter node of $p_1$ dominates (the unique node
associated with) $p_2$. We now give an example. (The
superscripts distinguish instances of symbols and are
not part of the nonterminal alphabet.)

Example 2

Let $G_2 = (V_N, V_T, V, S')$ with:

$V_N = \{S', VP, NP_{nom}, NP_{dat}, NP_{acc}\}$
$V_T = \{$daß, verspricht, zu versuchen, zu reparieren,
der Meister, niemandem, den Kühlschrank$\}$ $^2$
$V = \{v_1, v_2, v_3, v_4, v_5, v_6, v_7\}$
where the $v_i$ are as defined in Figure 3.

$v_1$: $\{(S' \to \text{daß}\ VP)\}$ with $dom_{v_1} = \emptyset$
$v_2$: $\{(VP^{(1)} \to NP_{nom}\ VP^{(2)}), (VP^{(3)} \to NP_{dat}\ VP^{(4)}), (VP^{(5)} \to VP^{(6)}\ VP^{(7)}), (VP^{(8)} \to \text{verspricht})\}$ with
$dom_{v_2} = \{(VP^{(2)}, VP^{(8)}), (VP^{(4)}, VP^{(8)}), (VP^{(7)}, VP^{(8)})\}$
$v_3$: $\{(VP^{(1)} \to VP\ VP^{(2)}), (VP^{(3)} \to \text{zu versuchen})\}$ with $dom_{v_3} = \{(VP^{(2)}, VP^{(3)})\}$
$v_4$: $\{(VP^{(1)} \to NP_{acc}\ VP^{(2)}), (VP^{(3)} \to \text{zu reparieren})\}$ with $dom_{v_4} = \{(VP^{(2)}, VP^{(3)})\}$
$v_5$: $\{(NP_{nom} \to \text{der Meister})\}$ with $dom_{v_5} = \emptyset$
$v_6$: $\{(NP_{dat} \to \text{niemandem})\}$ with $dom_{v_6} = \emptyset$
$v_7$: $\{(NP_{acc} \to \text{den Kühlschrank})\}$ with $dom_{v_7} = \emptyset$

Figure 3: Definition of V for UVG-DL $G_2$

[Derivation tree whose nodes are annotated with the production instances applied, e.g. $S'(p_{1,1})$, $VP(p_{2,1})$, $NP(p_{5,1})$.]

Figure 4: Sample UVG-DL derivation


A sample derivation is shown in Figure 4, where the 
dominance relations are shown by dotted lines. Observe
that the example grammar is lexicalized. We will
denote the class of lexicalized UVG-DL by $\text{UVG-DL}_{Lex}$.
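
The two conditions on UVG-DL derivations can be checked mechanically on a derivation tree. The following sketch makes several assumptions that are not in the definition itself: the `Node` class, the function names, and the encoding of a rule application as a triple (vector instance id, vector name, production name) are all hypothetical, and dominance is taken to be reflexive, so that a daughter node may itself be the node rewritten by the linked production.

```python
class Node:
    """Node of a context-free derivation tree."""

    def __init__(self, label, parent=None, rule=None):
        self.label = label
        self.parent = parent
        self.children = []
        # (vector instance id, vector name, production name) of the production
        # that rewrote this node, or None for nodes that were not rewritten
        self.rule = rule
        if parent is not None:
            parent.children.append(self)


def dominates(a, b):
    """Reflexive dominance: a is b, or a is an ancestor of b."""
    while b is not None:
        if b is a:
            return True
        b = b.parent
    return False


def vectors_complete(nodes, vectors):
    """Condition 1: each vector instance uses each of its productions once."""
    used = {}
    for node in nodes:
        if node.rule is not None:
            inst, vec, prod = node.rule
            used.setdefault((inst, vec), []).append(prod)
    return all(sorted(prods) == sorted(vectors[vec])
               for (inst, vec), prods in used.items())


def links_hold(nodes, links):
    """Condition 2: for a link (src, k, dst), the k-th daughter (0-based) of the
    node rewritten by src dominates the node rewritten by dst."""
    by_rule = {node.rule: node for node in nodes if node.rule is not None}
    for (inst, vec, prod), node in by_rule.items():
        for src, k, dst in links.get(vec, []):
            if src != prod:
                continue
            target = by_rule.get((inst, vec, dst))
            if target is None or not dominates(node.children[k], target):
                return False
    return True


# A two-production vector: (VP -> VP VP), (VP -> zu versuchen), with a link
# from the second daughter of the first production to the second production.
vectors = {"v3": ["p1", "p2"]}
links = {"v3": [("p1", 1, "p2")]}

root = Node("VP", rule=(0, "v3", "p1"))
left = Node("VP", parent=root)
right = Node("VP", parent=root, rule=(0, "v3", "p2"))
leaf = Node("zu versuchen", parent=right)
nodes = [root, left, right, leaf]
print(vectors_complete(nodes, vectors), links_hold(nodes, links))
```

The checker does not verify that the frontier consists of terminals only; it isolates the two conditions that distinguish UVG-DL from a plain context-free derivation.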

It is clear that the dominance links of UVG-DL are 
the additional constraints that we argued above are nec- 
essary to adequately restrict the structural relation be- 
tween arguments and their verbs. Furthermore, UVG- 
DL is a notational variant of QTSG: every vector rep- 
resents a quasi-tree, and identifying quasi-nodes cor- 
responds to rewriting. The condition on a successful 
derivation in QTSG - that all nonterminal nodes be 
identified - corresponds to the definition of a derivation 
in UVG-DL. We have therefore found a mathematical 
model for the second and third phenomena mentioned
in Section 2.

2 Gloss (in order): that, promises, to try, to repair, the
master, no-one, the refrigerator.

We now turn to the formal properties of UVG-DL. 
Our main result is that UVG-DL is weakly equivalent 
to {}-LIG. The sets of a {}-LIG implement the domi- 
nance links and make sure that all members from one 
set of rules are used during a derivation. We first introduce
some more terminology with which to describe
the derivations of UVG-DLs. If two productions $p_{v,1}$
and $p_{v,2}$ from vector v are linked by a dominance link
from a right-hand side nonterminal of $p_{v,1}$ to the left-hand
nonterminal of $p_{v,2}$, then we will denote this link by
$l_{v,1,2}$. We will say that $p_{v,1}$ (or the right-hand side nonterminal
in question) has a passive dominance requirement
of $l_{v,1,2}$, and that $p_{v,2}$ has an active dominance
requirement of $l_{v,1,2}$. If $p_{v,1}$ or $p_{v,2}$ is used in a partial
derivation such that the other production is not used
in the derivation, the dominance requirement (passive
or active) will be called unfulfilled. Let $\varrho$ be a (partial)
derivation. We associate with $\varrho$ a multiset which represents
all the unfulfilled active dominance requirements
of $\varrho$, written $\top(\varrho)$.

Theorem 7 $\mathcal{L}(\text{UVG-DL}) = \mathcal{L}(\{\}\text{-LIG})$



Outline of the proof. The theorem is proved in two
parts (one for each inclusion). We first show the inclusion
$\mathcal{L}(\text{UVG-DL}) \subseteq \mathcal{L}(\{\}\text{-LIG})$. Let $G = (V_N, V_T, V, S)$
be a UVG-DL, where $V = \{v_1, \ldots, v_K\}$ with $v_i = (p_{i,1}, \ldots, p_{i,k_i})$,
$k_i = |v_i|$, $1 \leq i \leq K$. We construct
a {}-LIG $G' = (V_N, V_T, V_I, P, S)$. Let $V_I = \{l_{i,j,k} \mid 1 \leq i \leq K,\ 1 \leq j, k \leq k_i\}$.
Define P as follows.

Let $v \in V$, and let $p \in v$ be the production
$A \to w_0 B_1 w_1 \cdots B_n w_n$. In the following, we will
denote by $\top(p)$ the multiset of active dominance requirements
of p, and by $\bot_i(p)$ the multiset of passive
dominance requirements of $B_i$, $1 \leq i \leq n$. Add to P
the following production:

$A\,\top(p) \to w_0 B_1 \bot_1(p) w_1 \cdots B_n \bot_n(p) w_n$

P contains no other productions. We show by induction
that for A in $V_N$ and w in $V_T^*$, we have $A \Rightarrow_G^* w$
iff $A \Rightarrow_{G'}^* w$. Specifically, we show that for all integers
k, $\varrho : A \Rightarrow_G^k w$, $w \in V_T^*$, with unfulfilled active dominance
requirements $\top(\varrho)$, implies that there is a derivation
$A\,\top(\varrho) \Rightarrow_{G'}^k w$, and, conversely, we show that for
all integers k, $At \Rightarrow_{G'}^k \alpha$, $A \in V_N$, t a multiset of elements
of $V_I$, and $\alpha \in V_T^*$, implies that there is a derivation
$\varrho : A \Rightarrow_G^k \alpha$ such that $\top(\varrho) = t$.
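
The construction of P from a single vector can be sketched as follows. This is an illustration under assumptions: the function name `vector_to_lig` and the encoding of links as triples `(j, pos, k)` (the symbol at position `pos` in the right-hand side of production j is linked to the left-hand side of production k) are hypothetical, positions are 0-based, and index symbols are represented as tuples `(vector name, j, k)`.

```python
from collections import Counter


def vector_to_lig(name, productions, links):
    """Translate one UVG-DL vector into {}-LIG productions.

    productions: list of (lhs, rhs) with rhs a list of symbols.
    links: list of (j, pos, k) triples as described above.
    Returns a list of (lhs, active, rhs') where `active` is the multiset of
    active dominance requirements of the production and rhs' pairs every
    right-hand-side symbol with its multiset of passive requirements.
    """
    out = []
    for idx, (lhs, rhs) in enumerate(productions):
        # active requirements: links that end at this production's left-hand side
        active = Counter((name, j, k) for (j, pos, k) in links if k == idx)
        new_rhs = []
        for here, sym in enumerate(rhs):
            # passive requirements: links that start at this daughter
            passive = Counter((name, j, k) for (j, pos, k) in links
                              if j == idx and pos == here)
            new_rhs.append((sym, passive))
        out.append((lhs, active, new_rhs))
    return out


# A two-production vector with one link from the second daughter of the
# first production to the second production.
v3 = [("VP", ["VP", "VP"]), ("VP", ["zu versuchen"])]
for production in vector_to_lig("v3", v3, [(0, 1, 1)]):
    print(production)
```

In the output, the first production adds the index to its second daughter, and the second production can only apply to a nonterminal whose multiset contains that index, which is how the multiset enforces the dominance link.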

For the inclusion $\mathcal{L}(\{\}\text{-LIG}) \subseteq \mathcal{L}(\text{UVG-DL})$, we take
a slightly different approach to avoid notational complexity.
Let $G = (V_N, V_T, V_I, P, S)$ be a {}-LIG in
RINF. We construct a UVG-DL $G' = (V_N, V_T, V, S)$,
where V is defined as follows:

1. If $p \in P$ is a {}-LIG production of RINF type 1, then
$((p), \emptyset) \in V$.

2. If $p \in P$ is a {}-LIG production of RINF type 2,
with $p = A \to Bf$ for $A, B \in V_N$, $f \in V_I$, then for
all $q \in P$ such that $q = Cf \to D$,
$v = ((A \to B, C \to D), dom_v(B, C))$ is in V.

Let A be in $V_N$, and w in $V_T^*$. We show by induction
that $S \Rightarrow_G^* w$ iff $S \Rightarrow_{G'}^* w$. Specifically, we first show
that for all integers k, for all {}-LIGs G and the corresponding
UVG-DL G' as constructed above, if there is
a derivation $\varrho : S\{\} \Rightarrow_G^* w$ with k instances of applications
of rules of type 2, then there is a derivation
$\varrho' : S \Rightarrow_{G'}^* w$ such that $\varrho$ and $\varrho'$ are identical
except for the index symbols in the sentential forms
of $\varrho$. For the converse inclusion, we show that for all
integers k, for all {}-LIGs G and the corresponding UVG-DL
G' as constructed above, if there is a derivation
$\varrho' : S \Rightarrow_{G'}^* w$ with k instances of applications of
rules from vectors with two elements, then there is a
derivation $\varrho : S\{\} \Rightarrow_G^* w$ such that $\varrho$ and $\varrho'$ are identical
except for the index symbols in the sentential forms of
$\varrho$. ■
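
The second construction, from a {}-LIG in RINF to a UVG-DL, pairs every production that adds an index f with every production that removes f. A minimal sketch, assuming a hypothetical tagged-tuple encoding of RINF productions (`"plain"` for type 1, `"push"` for type 2, `"pop"` for type 3) and a hypothetical function name:

```python
def rinf_to_uvgdl(productions):
    """Build the vectors of a UVG-DL from RINF productions.

    productions: list of tuples
        ("plain", A, alpha)   for  A -> alpha
        ("push",  A, B, f)    for  A -> B f
        ("pop",   C, f, D)    for  C f -> D
    Returns a list of vectors (cf_productions, links). A link ((i, pos), j)
    says that the symbol at position pos in the right-hand side of production i
    must dominate the left-hand side of production j.
    """
    vectors = []
    for p in productions:
        if p[0] == "plain":                       # case 1: singleton vector
            vectors.append(([(p[1], list(p[2]))], []))
    for p in productions:
        if p[0] == "push":                        # case 2: pair with each pop of f
            _, a, b, f = p
            for q in productions:
                if q[0] == "pop" and q[2] == f:
                    _, c, _, d = q
                    vectors.append(([(a, [b]), (c, [d])], [((0, 0), 1)]))
    return vectors


rules = [
    ("plain", "S", ["a"]),
    ("push", "S", "A", "f"),
    ("pop", "B", "f", "C"),
    ("pop", "D", "g", "E"),   # never paired: no production pushes g
]
for vector in rinf_to_uvgdl(rules):
    print(vector)
```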

This equivalence lets us transfer results from {}-LIG 
to UVG-DL. It can easily be seen from the construction 
employed in the proof of Theorem 7 that a lexicalized
UVG-DL maps to a linearly restricted {}-LIG. For linguistic
purposes we are only interested in lexicalized
grammars, and therefore the linear restriction is quite 
natural. We obtain the following corollaries thanks to 
Theorem 7. 

Corollary 8 $\mathcal{L}(\text{UVG-DL}_{Lex}) \subseteq \mathcal{L}(\text{CSG})$.

Corollary 9 $\mathcal{L}(\text{UVG-DL})$ is a substitution-closed full
AFL.

Corollary 10 Each language in $\mathcal{L}(\text{UVG-DL}_{Lex})$ can
be recognized in polynomial deterministic time.

Related Formalisms 

Based on word-order facts from Turkish, Hoffman 
(1992) proposes an extension to CCG called {}-CCG, in 
which arguments of functors form sets, rather than be- 
ing represented in a curried notation. Under function 
composition, these sets are unioned. Thus the move 
from CCG to {}-CCG corresponds very much to the 
move from LIG to {}-LIG. We conjecture that (a ver- 
sion of) {}-CCG is weakly equivalent to {}-LIG. 

Staudacher (1993) defines a related system called dis- 
tributed index grammar or DIG. DIG is like LIG, except 
that the stack of index symbols can be split into chunks 
and distributed among the daughter nodes. However, 
the formalism is not convincingly motivated by the lin- 
guistic data given (which can also be handled by a sim- 
ple LIG) or by other considerations. 

Several extensions to {}-LIG and UVG-DL are de- 
fined in (Rambow, 1994). First, we can introduce the 
"integrity" constraint suggested by Becker et al. (1991) 
which restricts long-distance relations through nodes. 
This is necessary to implement the linguistic notion of 
"barrier" or "island". Second, we can define the tree- 
rewriting version of UVG-DL, called V-TAG. This is 
motivated by Conjecture 4, which (if true) means that 
UVG-DL cannot derive Swiss German. Under either ex- 
tension, the weak generative power is extended, but the 
formal and computational results obtained for {}-LIG 
and UVG-DL still hold. 

Conclusion 

This paper has presented two equivalent formalisms, 
{}-LIG and UVG-DL, which provide formal models for 
the three different phenomena that we identified in the 
beginning of the paper. We have shown that both for- 
malisms, under certain restrictions that are compati- 
ble with the motivating phenomena, are restricted in 
their generative capacity and polynomially parsable, 
thus making them attractive candidates for modeling 
natural language. Furthermore, the formalisms are 
substitution-closed AFLs, suggesting that the defini- 
tions we have given are "natural" from the point of 
view of formal language theory. 

Acknowledgments 

I would like to thank Bob Kasper, Gaelle Recource, 
Giorgio Satta, Ed Stabler, two anonymous reviewers,
and especially K. Vijay-Shanker for useful comments
and discussions. The research reported in this paper 
was conducted while the author was with the Com- 
puter and Information Science Department of the Uni- 
versity of Pennsylvania. The research was sponsored 
by the following grants: ARO DAAL 03-89-C-0031; 
DARPA N00014-90-J-1863; NSF IRI 90-16592; and Ben 
Franklin 91S.3078C-1. 

Bibliography 

Aho, A. V. (1968). Indexed grammars - an extension 
to context free grammars. J. ACM, 15:647-671. 

Becker, Tilman; Joshi, Aravind; and Rambow, Owen 
(1991). Long distance scrambling and tree adjoin- 
ing grammars. In Fifth Conference of the European 
Chapter of the Association for Computational Lin- 
guistics (EACL'91), pages 21-26. ACL. 

Chomsky, Noam (1981). Lectures on Government and
Binding. Studies in generative grammar 9. Foris, 
Dordrecht. 

Cremers, A. B. and Mayer, O. (1973). On matrix lan- 
guages. Information and Control, 23:86-96. 

Cremers, A. B. and Mayer, O. (1974). On vector
languages. J. Comput. Syst. Sci., 8:158-166.

Gazdar, G. (1988). Applicability of indexed grammars 
to natural languages. In Reyle, U. and Rohrer, C.,
editors, Natural Language Parsing and Linguistic 
Theories. D. Reidel, Dordrecht. 

Hoffman, Beryl (1992). A CCG approach to free word 
order languages. In 30th Meeting of the Associa- 
tion for Computational Linguistics (ACL'92). 

Joshi, Aravind; Levy, Leon; and Takahashi, M (1975). 
Tree adjunct grammars. J. Comput. Syst. Sci.,
10:136-163. 

Joshi, Aravind K. (1985). How much context- 
sensitivity is necessary for characterizing struc- 
tural descriptions — Tree Adjoining Grammars. 
In Dowty, D.; Karttunen, L.; and Zwicky, A., ed- 
itors, Natural Language Processing — Theoreti- 
cal, Computational and Psychological Perspective, 
pages 206-250. Cambridge University Press, New 
York, NY. Originally presented in 1983. 

Karttunen, Lauri (1989). Radical lexicalism. In Baltin, 
Mark and Kroch, Anthony S., editors, Alternative 
conceptions of phrase structure, pages 43-65. Uni- 
versity of Chicago Press, Chicago. 

Kasper, Robert (1992). Compiling head-driven phrase 
structure grammar into lexicalized tree adjoining 
grammar. Presented at the TAG+ Workshop, Uni- 
versity of Pennsylvania. 

Marcus, Mitchell; Hindle, Donald; and Fleck, Margaret 
(1983). D-theory: Talking about talking about 
trees. In Proceedings of the 21st Annual Meeting 
of the Association for Computational Linguistics,
Cambridge, MA. 



Pollard, Carl (1984). Generalized phrase structure 
grammars, head grammars and natural language. 
PhD thesis, Stanford University, Stanford, CA. 

Pollard, Carl and Sag, Ivan (1987). Information- 
Based Syntax and Semantics. Vol 1: Fundamen- 
tals. CSLI. 

Pollard, Carl and Sag, Ivan (1992). Anaphors in En- 
glish and the scope of binding theory. Linguistic 
Inquiry, 23(2):261-303. 

Pollard, Carl and Sag, Ivan (1994). Head-Driven 
Phrase Structure Grammar. University of Chicago 
Press, Chicago. Draft distributed at the Third
European Summer School in Language, Logic and
Information, Saarbrücken, 1991.

Rambow, Owen (1994). Formal and Computational 
Models for Natural Language Syntax. PhD thesis, 
Department of Computer and Information Science, 
University of Pennsylvania, Philadelphia. 

Rambow, Owen and Satta, Giorgio (1994). A rewriting 
system for free word order syntax that is non-local 
and mildly context sensitive. In Martin-Vide, Carlos,
editor, Current Issues in Mathematical Linguistics,
North-Holland Linguistic series, Volume
56. Elsevier-North Holland, Amsterdam. 

Satta, Giorgio (1993). Recognition of vector languages. 
Unpublished manuscript, Università di Venezia.

Shieber, Stuart B. (1985). Evidence against the 
context-freeness of natural language. Linguistics 
and Philosophy, 8:333-343. 

Staudacher, Peter (1993). New frontiers beyond 
context-freeness: DI-grammars and DI-automata.
In Sixth Conference of the European Chapter
of the Association for Computational Linguistics
(EACL'93).

Steedman, Mark (1985). Dependency and coordination 
in the grammar of Dutch and English. Language, 
61. 

Vijay-Shanker, K. (1992). Using descriptions of trees in 
a Tree Adjoining Grammar. Computational Lin- 
guistics, 18(4):481-518. 

Vijay-Shanker, K. and Weir, David (1994). The equiva- 
lence of four extensions of context-free grammars. 
Math. Syst. Theory. Also available as Technical 
Report CSRP 236 from the University of Sussex, 
School of Cognitive and Computing Sciences. 

Vijay-Shanker, K.; Weir, D.J.; and Joshi, A.K. (1987). 
Characterizing structural descriptions produced by 
various grammatical formalisms. In 25th Meeting 
of the Association for Computational Linguistics 
(ACL '87), Stanford, CA. 



